* Cleaning of Mexican micro data from EIA

* cleaning steps:
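	* (1) Drop observations with missing data on materials or sales (needed to establish whether a firm is an importer / exporter)
	* (2) Drop firm-year observations with extreme year-to-year growth in sales or total inputs (below the 1st or above the 99th percentile within each year)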

use "input/EIA.dta", clear
do "dofiles/constructvariables.do"
destring year, replace
destring id, replace
egen id2 = group(id)
sort id2 year

* Add nominal exchange rate for Mexico
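* (domestic currency per U.S. dollar, quarterly frequency, averaged by year)
* m:1 assumes the exchange-rate file holds exactly one row per year; Stata stops with an error otherwise.
* Years that appear only in the exchange-rate file are not kept, since they carry no firm data.
merge m:1 year using "input/ERmexico.dta", keep(master match)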
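
* Optional sanity check (a sketch, not part of the original cleaning steps):
* count firm-year observations left without an exchange rate after the merge.
* A nonzero count means the USD variables built below are missing for those rows.
count if missing(er) & !missing(id2)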

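* Total materials and total inputs in domestic currency, then sales and inputs converted to U.S. dollars
gen mat = matdom + matfor
gen inputs = mat + wages
gen matusd = mat/er
gen salesusd = salestot/er
gen inputusd = inputs/er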

xtset id2 year, yearly
order id2, first
order id, last

* Drop if data to establish whether importer / exporter is missing
* missing() also catches extended missing values (.a, .b, ...), which ==. would let through
drop if missing(matdom, matfor, salesdom, exports)

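* Drop outliers: firm-year observations whose growth in sales or total inputs (in %, USD terms)
* lies below the 1st or above the 99th percentile of that year's distribution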
gen salesgrowth = (salesusd/l.salesusd-1)*100
gen inputgrowth = (inputusd/l.inputusd-1)*100
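* Worked example of the growth formula above (illustrative numbers, not taken from the data):
* sales rising from 100 to 150 USD give (150/100 - 1)*100 = 50, i.e. 50% growth.
* Where the previous year is absent from the panel, l.salesusd is missing and so is the growth rate.
display (150/100 - 1)*100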
sort year id2
bys year: egen salesgrowth99 = pctile(salesgrowth), p(99)
bys year: egen inputgrowth99 = pctile(inputgrowth), p(99)
bys year: egen salesgrowth01 = pctile(salesgrowth), p(1)
bys year: egen inputgrowth01 = pctile(inputgrowth), p(1)
* Flag outliers; observations with missing growth (e.g. a firm's first year) are never flagged
gen byte I99  = salesgrowth > salesgrowth99 & salesgrowth != .
gen byte I99m = inputgrowth > inputgrowth99 & inputgrowth != .
gen byte I01  = salesgrowth < salesgrowth01 & salesgrowth != .
gen byte I01m = inputgrowth < inputgrowth01 & inputgrowth != .
keep if I99==0 & I99m==0 & I01==0 & I01m==0
drop I99 I99m I01 I01m
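
* Optional check (a sketch, not in the original script): inspect the growth
* distributions that remain after trimming and the number of observations kept.
summarize salesgrowth inputgrowth, detail
count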

do "dofiles/constructvariables2.do"
save "temp/EIA_clean.dta", replace

